National Repository of Grey Literature: 127 records found (showing 1 - 10)
Mobile platform for testing of automotive systems in Bluetooth Hands-Free communication
Mecerod, Václav ; Stifter, Jiří (referee) ; Kratochvíl, Tomáš (advisor)
This master's thesis deals with the implementation of hands-free communication systems in the automotive industry. The first chapter focuses on theoretical aspects of speech processing in embedded applications, such as noise suppression, acoustic feedback suppression, and other factors affecting the quality of hands-free systems. The second chapter presents the design of a compact, flexible mobile test device for wireless hands-free communication modules.
Network Interface for Keyword Spotting System
Skotnica, Martin ; Glembek, Ondřej (referee) ; Szőke, Igor (advisor)
A considerable part of computer science research is dedicated to speech recognition, as speech-controlled systems are becoming useful in many applications. One of them is keyword spotting, which makes it possible to find words in audio data. Such a detector is being developed at the BUT Faculty of Information Technology. The goal of this work is to propose a network interface to this keyword detector based on a client/server architecture. The client connects to the server and sends audio data. The server runs the keyword detector on the received data and sends the keyword spotting result back to the client. Finally, the client visualizes the result and interacts with the user.
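The request/response flow described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's interface: the length-prefixed framing, the JSON reply, and every function name (`send_msg`, `recv_msg`, `stub_detector`, `serve_once`, `spot_keywords`, `run_demo`) are hypothetical, and the detector is a placeholder that does no speech processing.

```python
import json
import socket
import struct
import threading


def recv_exact(conn, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def send_msg(conn, payload):
    """Send one message: 4-byte big-endian length, then the payload."""
    conn.sendall(struct.pack(">I", len(payload)) + payload)


def recv_msg(conn):
    """Receive one length-prefixed message."""
    (length,) = struct.unpack(">I", recv_exact(conn, 4))
    return recv_exact(conn, length)


def stub_detector(audio):
    """Placeholder for the real keyword detector (hypothetical).

    Reports one fake hit per 16000 bytes of audio so that the round
    trip can be observed; it does not analyse the audio at all.
    """
    return [{"keyword": "example", "offset": i}
            for i in range(0, len(audio), 16000)]


def serve_once(server_sock):
    """Accept one client, run the detector on its audio, reply with JSON."""
    conn, _ = server_sock.accept()
    with conn:
        audio = recv_msg(conn)
        send_msg(conn, json.dumps(stub_detector(audio)).encode("utf-8"))


def spot_keywords(host, port, audio):
    """Client side: send audio bytes, return the decoded detection list."""
    with socket.create_connection((host, port), timeout=5.0) as conn:
        send_msg(conn, audio)
        return json.loads(recv_msg(conn).decode("utf-8"))


def run_demo():
    """Run one server and one client on localhost and return the result."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.settimeout(5.0)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_sock,))
    worker.start()
    try:
        return spot_keywords("127.0.0.1", port, b"\x00" * 40000)
    finally:
        worker.join()
        server_sock.close()
```

Calling `run_demo()` sends 40000 bytes of silence through the loop and returns the stub's detections. A real client would hand this list to its visualization layer instead of returning it.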
Interpretability of Neural Networks in Speech Processing
Sarvaš, Marek ; Mošner, Ladislav (referee) ; Žmolíková, Kateřina (advisor)
With the growing popularity of deep neural networks, the lack of transparency caused by their black-box nature increases the demand for their interpretation. The aim of this thesis is to gain new insight into deep neural networks in speech processing tasks, specifically gender classification on the AudioMNIST dataset and speaker classification from filter banks of the VoxCeleb dataset, using a convolutional and a residual neural network. Layer-wise relevance propagation was used to interpret these networks. This method produces a heat map that marks the features that contributed positively to the correct classification and those that contributed negatively. As the interpretation results show, the classifications were based mainly on the lower frequencies in speech and on time. For gender classification, I found that the model depends heavily on a very small number of features. Using the information obtained, I created an augmented training set that increased the robustness of the model.
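To illustrate how layer-wise relevance propagation assigns positive and negative relevance to input features, here is a minimal sketch on a toy two-layer dense ReLU network without biases. It is not the thesis's setup, which used convolutional and residual networks: the epsilon propagation rule, the weights, and all function names (`dense`, `relu`, `lrp_epsilon`, `explain`) are assumptions made for this example.

```python
def dense(a, w):
    """Pre-activations z_j = sum_i a[i] * w[i][j] (no bias term)."""
    return [sum(a[i] * w[i][j] for i in range(len(a)))
            for j in range(len(w[0]))]


def relu(z):
    return [max(0.0, v) for v in z]


def lrp_epsilon(a, w, relevance_out, eps=1e-9):
    """Redistribute relevance from a layer's outputs to its inputs.

    Epsilon rule: R_i = sum_j (a_i * w_ij) / (z_j + eps * sign(z_j)) * R_j.
    The small eps only guards against division by zero.
    """
    z = dense(a, w)
    relevance_in = [0.0] * len(a)
    for j, zj in enumerate(z):
        denom = zj + (eps if zj >= 0 else -eps)
        for i in range(len(a)):
            relevance_in[i] += a[i] * w[i][j] / denom * relevance_out[j]
    return relevance_in


def explain(x, w1, w2):
    """Forward pass, then propagate the output score back to the inputs."""
    hidden = relu(dense(x, w1))
    out = dense(hidden, w2)
    r_hidden = lrp_epsilon(hidden, w2, out)  # start from the output score
    r_input = lrp_epsilon(x, w1, r_hidden)
    return out, r_input


# Hypothetical input and weights, chosen only for illustration.
x = [1.0, 2.0, 0.5]
w1 = [[1.0, -1.0], [0.5, 0.5], [-2.0, 1.0]]
w2 = [[2.0], [-1.0]]
out, relevance = explain(x, w1, w2)
```

For these values the output score is 1.5 and the input relevances are approximately 3.0, 1.0, and -2.5: the first two features support the score and the third works against it. The relevances sum to the output score, which is the conservation property that makes the resulting heat map interpretable.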
Voice Conversion
Hodaň, David ; Novotný, Ondřej (referee) ; Černocký, Jan (advisor)
Voice conversion is the process of transforming the speech parameters of one speaker in such a way that his or her speech sounds as if spoken by someone else. This thesis presents a short summary of several techniques currently used for conversion. First, the theory of voice creation is described, with an emphasis on the key attributes that characterize and identify a speaker's voice. Methods for voice modification are discussed, together with the advantages and pitfalls that determine the use cases to which these methods are suited. A high-level overview of how speech is transformed between the source and the target speakers is presented. This description is subsequently used to design a voice conversion system intended to demonstrate one of the possible approaches to the conversion problem. The process of conversion consists of two phases: training and synthesis. As part of this project, a computer program for voice conversion based on the MATLAB programming environment has been developed. Its design, implementation and results are discussed.
Web-Based Lecture Browser
Žižka, Josef ; Mikolov, Tomáš (referee) ; Fapšo, Michal (advisor)
This thesis deals with a web-based lecture browser. Its goal is to facilitate access to information through modern speech and multimedia technologies. The technologies used for this browser are discussed. Video recordings play a very important role in the browser, and therefore a large portion of this work is devoted to digital video and methods of delivering it using streaming servers. Similar multimedia browsers are mentioned. The reader is acquainted with the browser design, including the various components of the browser and how they are synchronized with one another. The final version of the browser is introduced, and the problems that occurred during development and deployment are described. The conclusion discusses the future development of the web-based lecture browser.
Module for Pronunciation Training and Foreign Language Learning
Kudláč, Vladan ; Herout, Adam (referee) ; Szőke, Igor (advisor)
The aim of this thesis is to improve the implementation of a module for mobile pronunciation-training applications, to find places suitable for optimization, and to carry out the optimization in order to increase accuracy, reduce processing time, and reduce the memory requirements of processing.
Web Based Audio/Video Lecture Browser: Porting of the Browser to MySQL Database
Janovič, Jakub ; Fapšo, Michal (referee) ; Szőke, Igor (advisor)
This project deals with a web-based lecture browser whose goal is to simplify the gaining of knowledge with the use of multimedia. It presents an existing lecture browser that was created for a diploma thesis at FIT VUT Brno. It describes the technologies currently in use and those that will be used to migrate the browser to a MySQL database and to develop a transcription module for speeches. The reader will be acquainted with an analysis and model of the new application. Furthermore, implementation methods for development and subsequent testing are discussed. The project closes with a conclusion about the future development of web-based lecture browsers.
Vizualization of Outputs from Speech Technologies for Contact Centers
Zhezhela, Oleksandr ; Szőke, Igor (referee) ; Schwarz, Petr (advisor)
The thesis is aimed at the visualisation of data mined by speech processing technologies. Several methods of speech data extraction were studied, and technologies for this task were analysed. The variety of metadata that can be mined from speech was defined. Existing standards and processes of call centres were also examined. Requirements for the user interface were gathered and analysed. On that basis, and after communication with call centre employees, a concept for speech data visualisation was defined and implemented. The resulting solutions were integrated into the Speech Analytics Server (SPAS).
Learning the Face Behind a Voice
Zubalík, Petr ; Mošner, Ladislav (referee) ; Plchot, Oldřich (advisor)
The main goal of this thesis is to design and implement a system that will be able to generate a face based on the speech of a given person. This problem is solved using a system composed of three convolutional neural network models. The first one is based on the ResNet architecture and is used to extract features from speech recordings. The second model is a fully convolutional neural network which converts the extracted features into the styles which form a base for the final facial image. These styles are then passed as an input to the StyleGAN generator, which creates the resulting face. The proposed system is implemented in the Python programming language using the PyTorch framework. The last chapter of the thesis discusses some of the most significant experiments performed to fine-tune and test the developed system.
Keyword Spotting Implementation to Mobil Phone (Symbian 60)
Cipr, Tomáš ; Schwarz, Petr (referee) ; Szőke, Igor (advisor)
Keyword spotting is one of the many applications of automatic speech recognition. Its purpose is to determine the spots in a given utterance where some of the specified words were spoken. Keyword spotting has great potential to enhance the performance of new applications as well as existing ones. An example could be voice control of a mobile phone. With OS Symbian coming to the market, it is even possible for end users to implement keyword spotting for a mobile phone on their own. The thesis describes the theoretical prerequisites for keyword spotting and its implementation. First, OS Symbian is presented with respect to the given task. Second, each step of the keyword spotting process is described. Finally, the object design of the keyword spotter is presented, followed by a description of the implementation. The thesis concludes with a review of the results and notes on possible improvements.
